Strain G2T is the type strain of Bacillus massiliogorillae sp. nov., a proposed new species within
the genus Bacillus. This strain, whose genome is described here, was isolated in France from the
fecal sample of a wild western lowland gorilla from Cameroon. B. massiliogorillae is a
facultatively anaerobic, Gram-variable, rod-shaped bacterium. Here we describe the features of this
organism, together with the complete genome sequence and annotation. The 5,431,633 bp long
genome (1 chromosome but no plasmid) contains 5,179 protein-coding and 98 RNA genes,
including 91 tRNA genes.



Introduction 

Strain G2T (= CSUR P206 = DSM 26159) is the type strain of B. massiliogorillae sp. nov. This
bacterium is a Gram-variable, facultatively anaerobic, indole-negative bacillus with rounded
ends. It was isolated from the stool sample of a Gorilla gorilla gorilla as part of a
"culturomics" study aiming at cultivating bacterial species within gorilla feces.

The genus Bacillus (Cohn 1872) was created about 140 years ago [1]. To date this genus,
composed mostly of Gram-positive, motile, and spore-forming bacteria, includes 276 species
with validly published names [2]. Members of the genus Bacillus are ubiquitous bacteria
isolated from various environments including soil, fresh and sea water, and food, and
occasionally from humans and animals, in which they are either pathogens, such as B. anthracis
(the causative agent of anthrax) [3] and B. cereus (associated mainly with food poisoning) [4],
or saprophytes [5]. Bacillus species may also, rarely, be involved in a variety of human
infections, including pneumonia, bacteremia, meningitis, endocarditis, endophthalmitis,
osteomyelitis and skin/soft tissue infection [5]. However, few data are available about the
presence of the genus Bacillus in great apes. Recent reports have described the isolation of
atypical B. anthracis (B. anthracis-like bacteria) in wild chimpanzees and gorillas from
Africa [6-8].

Here we present a summary classification and a set of features for B. massiliogorillae sp. nov.
strain G2T together with the description of the complete genome sequence and annotation.
These characteristics support the circumscription of the species B. massiliogorillae [9].

Classification and features 

In July 2011, a fecal sample was collected from a wild western lowland gorilla near Messok, a
village in the south-eastern part of the Dja Faunal Park (Cameroon). The collection of the stool
sample was approved by the Ministry of Scientific Research and Innovation of Cameroon. No
experimentation was conducted on this gorilla. The fecal specimen was preserved at -80°C after
collection and sent to Marseille. Strain G2T (Table 1) was isolated in January 2012 by
cultivation on Brucella agar medium (Oxoid, Dardilly, France). This strain exhibited a 97.3%
16S rRNA nucleotide sequence similarity with Bacillus simplex, the phylogenetically closest
validly published Bacillus species (Figure 1). This value was lower than the 98.7% 16S rRNA
gene sequence threshold recommended by Stackebrandt and Ebers to delineate a new species
without carrying out DNA-DNA hybridization [23].
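
The threshold comparison above can be sketched in a few lines. This is only an illustration: the function names and the gap-handling rule (columns containing a gap are ignored) are assumptions made for the sketch, and it does not reproduce the alignment or similarity software actually used in this study.

```python
SPECIES_THRESHOLD = 98.7  # % 16S rRNA gene similarity threshold cited in the text [23]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Ignoring gapped columns is an assumption of this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def below_species_threshold(similarity: float, threshold: float = SPECIES_THRESHOLD) -> bool:
    """True when the similarity falls under the species-delineation threshold."""
    return similarity < threshold


# The 97.3% similarity to B. simplex reported above falls under the threshold.
print(below_species_threshold(97.3))
```

A similarity under the threshold only indicates that the isolate may represent a new species; the phenotypic and genomic comparisons in the rest of the paper carry the actual argument.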
Table 1. Classification and general features of Bacillus massiliogorillae strain G2T

MIGS ID     Property                  Term                               Evidence code a
            Current classification    Domain Bacteria                    TAS [10]
                                      Phylum Firmicutes                  TAS [11-13]
                                      Class Bacilli                      TAS [14,15]
                                      Order Bacillales                   TAS [16,17]
                                      Family Bacillaceae                 TAS [16,18]
                                      Genus Bacillus                     TAS [16,19,20]
                                      Species Bacillus massiliogorillae  IDA
                                      Type strain G2T                    IDA
            Gram stain                Variable                           IDA
            Cell shape                Rod                                IDA
            Motility                  Motile                             IDA
            Sporulation               Sporulating                        IDA
            Temperature range         Mesophile                          IDA
            Optimum temperature       37°C                               IDA
MIGS-6.3    Salinity                  Growth in BHI medium + 2% NaCl     IDA
MIGS-22     Oxygen requirement        Facultative anaerobic              IDA
            Carbon source             Unknown                            NAS
            Energy source             Unknown                            NAS
MIGS-6      Habitat                   Gorilla gut                        IDA
MIGS-15     Biotic relationship       Free living                        IDA
MIGS-14     Pathogenicity             Unknown                            NAS
            Biosafety level           2                                  NAS
            Isolation                 Gorilla feces                      NAS
MIGS-4      Geographic location       Cameroon                           IDA
MIGS-5      Sample collection time    July 2011                          IDA
MIGS-4.1    Latitude                  Unknown                            NAS
MIGS-4.1    Longitude                 Unknown                            NAS
MIGS-4.3    Depth                     Unknown                            NAS
MIGS-4.4    Altitude                  Unknown                            NAS

a Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report
exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living,
isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These
evidence codes are from the Gene Ontology project [21]. If the evidence is IDA, then the property was
directly observed for a live isolate by one of the authors or an expert mentioned in the acknowledgements.

Figure 1. Phylogenetic tree highlighting the position of Bacillus massiliogorillae strain G2T relative to other
type strains within the Bacillus genus. GenBank accession numbers are indicated in parentheses. Sequences
were aligned using CLUSTAL X (V2), and phylogenetic inferences were obtained using the maximum-likelihood
method within the MEGA 5 software [22]. Numbers at the nodes are percentages of bootstrap values obtained
by repeating the analysis 1,000 times to generate a majority consensus tree. Clostridium botulinum was used
as the outgroup. The scale bar represents a 2% nucleotide sequence divergence.

Taxa shown in the tree: Bacillus infernus (NR027227), Bacillus methanolicus (NR040985), Bacillus fumarioli
(FJ973527), Bacillus jeotgali (JX094165), Bacillus lentus (NR040792), Bacillus smithii (NR036987), Bacillus
badius (EU717967), Bacillus firmus (JX428993), Bacillus benzoevorans (Y14693), Bacillus nealsonii
(NR044546), Bacillus circulans (JF833093), Bacillus massiliogorillae (JX650055), Bacillus simplex
(HQ327114), Bacillus psychrosaccharolyticus (JX429005), Bacillus flexus (JN033557), Bacillus megaterium
(FJ685764), Bacillus thuringiensis (HF545324), Clostridium botulinum (AM412317).



Different growth temperatures (25, 30, 37, 45°C) were tested. Growth occurred at all tested
temperatures, and optimal growth was observed at 37°C. Colonies were 2-5 mm in diameter on
Columbia agar and grey opaque in color. Growth of the strain was tested under anaerobic and
microaerophilic conditions using GENbag anaer and GENbag microaer systems, respectively
(BioMerieux), and in aerobic conditions, with or without 5% CO2. Growth was achieved under
aerobic (with and without CO2), microaerophilic and anaerobic conditions. Gram staining showed
Gram-variable bacilli (Figure 2). A motility test was positive. Cells grown on agar sporulate,
and the rods have a length ranging from 3.2 to 7.5 µm (mean 5.4 µm) and a diameter ranging
from 0.8 to 1.2 µm (mean 1 µm), as determined by negative staining transmission electron
microscopy (Figure 3).



Strain G2T exhibited catalase activity but not oxidase activity. Using the API 50CH system
(BioMerieux), a positive reaction was observed for D-glucose, D-fructose, D-ribose,
N-acetylglucosamine, amygdalin, arbutin, aesculin, salicin, cellobiose, maltose, D-lactose,
D-trehalose, D-saccharose, and hydrolysis of starch. Using the API ZYM system, positive
reactions were observed for esterase (C4), esterase lipase (C8), acid phosphatase,
α-glucosidase and N-acetyl-β-glucosaminidase. The urease reaction was also positive, but
nitrate reduction and indole production were negative. B. massiliogorillae is susceptible to
amoxicillin, nitrofurantoin, erythromycin, doxycycline, rifampin, vancomycin, gentamicin and
imipenem but resistant to trimethoprim-sulfamethoxazole, ciprofloxacin, ceftriaxone and
amoxicillin-clavulanic acid.

Figure 3. Transmission electron microscopy of B. massiliogorillae strain G2T, using a Morgagni 268D
(Philips) at an operating voltage of 60 kV. The scale bar represents 1 µm.

Keita et al.

Table 2. Differential phenotypic characteristics between B. massiliogorillae sp. nov. strain G2T and
phylogenetically close Bacillus species.



Characteristic



B. massiliogorillae sp. nov. B. simplex B. psychrosaccharolyticus B. circulans



Cell diameter (µm)
Oxygen requirement 
Gram stain 
Salt requirement 
Motility 

Endospore formation 

Production of 

Alkaline phosphatase 
Acid phosphatase 
Catalase 
Oxidase 

Nitrate reductase 
Urease 

α-galactosidase

β-galactosidase

β-glucuronidase

α-glucosidase

N-acetyl-β-glucosaminidase

Indole 

Esterase 

Esterase lipase 

Naphthyl-AS-BI- 
phosphohydrolase 

Phenylalanine arylamidase 

Leucine arylamidase 
Cystine arylamidase 
Valine arylamidase 
Glycine arylamidase 

Utilization of 

D-mannose 

Amygdalin 

L-Arabinose 

Cellobiose 

Lactose 

D-xylose 

Glucose 

Mannitol 

Arabinose 

Xylose 

Glycerol 

D-Galactose 

Starch 

Habitat 



0.87-1.2 
aerobic 
var 

< 5% 



+ 
+ 
+ 



+ 
+ 



+ 
+ 



+ 
+ 



0.7-0.9 
aerobic 
var 

<7% 
v 
+ 

na 
na 

+ 

na 
na 
na 
na 
na 
na 
na 
na 
na 
na 

na 

na 

na 
na 
na 
na 



gorilla gut 



-/w 
na 
na 
na 
na 
na 
na 
na 

soil 



0.9-1 

facultative anaerobic 
var 
<10% 
+ 



na 
na 
+ 
na 
+ 
na 
na 
+ 
na 
na 
na 
na 
na 
na 

na 

na 

na 
na 
na 
na 



na 
na 

+ 
na 

+ 
na 

+ 

+ 

+ 

+ 

+ 
na 
na 

soil and lowland marshes 



0.5-0.8 
aerobic 
var 

<7% 
+ 
+ 

na 
na 

+ 
na 
na 
w 
na 

+ 
na 
na 
na 
na 
na 
na 

na 

na 

na 
na 
na 
na 

+ 
v 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 

environment 
and fish gut 



var: variable, +: positive result, -: negative result, na: data not available, w: weak positive result 

When compared to other Bacillus species, B. massiliogorillae differed from B. simplex [24] in
the utilization of amygdalin, cellobiose, lactose and glucose (Table 2). It also differed from
B. psychrosaccharolyticus [25] in nitrate reductase and β-galactosidase production, and in the
utilization of L-arabinose, mannitol, xylose and glycerol (Table 2). Differences were also
observed with B. circulans [26] in β-galactosidase production and the utilization of
D-mannose, L-arabinose, D-xylose, mannitol, arabinose, xylose, glycerol and D-galactose
(Table 2).

Matrix-assisted laser-desorption/ionization time-of-flight (MALDI-TOF) MS protein analysis was
carried out as previously described [27,28]. Strain G2T was deposited from 12 isolated
colonies. Each smear was overlaid with 2 µL of matrix solution (saturated solution of
alpha-cyano-4-hydroxycinnamic acid) in 50% acetonitrile, 2.5% trifluoroacetic acid, and
allowed to dry for five minutes. Measurements were performed with a Microflex spectrometer
(Bruker Daltonics, Leipzig, Germany). Spectra were recorded in the positive linear mode for the
mass range of 2,000 to 20,000 Da (parameter settings: ion source 1 (IS1), 20 kV; IS2, 18.5 kV;
lens, 7 kV). A spectrum was obtained after 675 shots at a variable laser power. The time of
acquisition was between 30 seconds and 1 minute per spot. The 12 G2T spectra were imported
into the MALDI BioTyper software (version 2.0, Bruker) and analyzed by standard pattern
matching (with default parameter settings) against 6,252 bacterial spectra, including 199
spectra from 104 Bacillus species, used as reference data in the BioTyper database. The method
of identification included the m/z from 3,000 to 15,000 Da. For every spectrum, 100 peaks at
most were taken into account and compared with spectra in the database. The resulting score
determined whether the isolate could be identified: a score > 2 with a validated species
enabled identification at the species level; a score > 1.7 but < 2 enabled identification at
the genus level; and a score < 1.7 did not enable any identification. For strain G2T, the
scores obtained ranged from 1.177 to 1.343, thus suggesting that our isolate was not a member
of a known species. We added the spectrum from strain G2T to our database (Figure 4).
Spectrum differences with other Bacillus species are shown in Figure 5.
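
The score interpretation described above can be written as a small decision function. This is a sketch, not the BioTyper software's own logic: the function name is hypothetical, and because the text leaves scores of exactly 2 and exactly 1.7 unassigned, the sketch assumes they fall into the lower category.

```python
def identification_level(score: float) -> str:
    """Map a pattern-matching score to the identification level given in the text.

    Assumption of this sketch: boundary scores (exactly 2.0 or 1.7), which the
    text does not assign, are placed in the lower category.
    """
    if score > 2.0:
        return "species"
    if score > 1.7:
        return "genus"
    return "none"


# Lowest and highest scores reported for the 12 spectra of strain G2T.
for score in (1.177, 1.343):
    print(score, identification_level(score))
```

Both reported extremes map to "none", which matches the statement that the isolate could not be assigned to any species or genus entry in the reference database.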



Genome sequencing information 

Genome project history 

The organism was selected for sequencing on the basis of its phylogenetic position and 16S
rRNA similarity to other members of the genus Bacillus, and is part of a "culturomics" study
of the gorilla flora aiming at isolating all bacterial species within gorilla feces. It was the
61st genome of a Bacillus species and the first genome of Bacillus massiliogorillae sp. nov.
The genome is deposited in GenBank under accession number CAVL000000000 and consists of 66
large contigs. A summary of the project information and its association with MIGS version 2.0
compliance [29] is shown in Table 3.

Growth conditions and DNA isolation 

B. massiliogorillae sp. nov. strain G2T (= CSUR P206 = DSM 26159) was grown aerobically on 5%
sheep blood-enriched Columbia agar at 37°C. Four petri dishes were spread and the growth was
resuspended in 3×500 µl of TE buffer and stored at -80°C. Then, 500 µl of this suspension was
thawed, centrifuged for 3 minutes at 10,000 rpm and resuspended in 3×100 µL of G2 buffer (EZ1
DNA Tissue kit, Qiagen). A first mechanical lysis was performed with glass powder on the
Fastprep-24 device (Sample Preparation system, MP Biomedicals, USA) using 2×20-second cycles.
DNA was then treated with 2.5 µg/µL lysozyme (30 minutes at 37°C) and extracted using the
BioRobot EZ1 Advanced XL (Qiagen). The DNA was then concentrated and purified using the
QIAamp kit (Qiagen). The yield and the concentration were measured with the Quant-it
Picogreen kit (Invitrogen) on the Genios Tecan fluorometer at 50 ng/µl.

Genome sequencing and assembly 

The paired-end library was prepared with 5 µg of bacterial DNA, fragmented on the Covaris
S-Series (S1, S2) instrument (Woburn, Massachusetts, USA) with an enrichment size of 3-5 kb.
The DNA fragmentation was visualized with the Agilent 2100 BioAnalyzer on a DNA labchip 7500.
The library was constructed according to the 454 GS FLX Titanium paired-end protocol (Roche).
Circularization and nebulization were performed and generated a pattern with an optimum at
500 bp. After PCR amplification through 15 cycles followed by double size selection, the
single-stranded paired-end library was quantified using the Quant-it Ribogreen kit
(Invitrogen) on the Genios Tecan fluorometer at 339 pg/µL. The library concentration
equivalence was calculated as 1.00E+08 molecules/µL. The library was stored at -20°C until
further use.

Figure 4. Reference mass spectrum from B. massiliogorillae strain G2T. Spectra from 12 individual colonies were
compared and a reference spectrum was generated.



The paired-end library was clonally amplified with 0.5 cpb and 1 cpb in 2 emPCR reactions
with the GS Titanium SV emPCR Kit (Lib-L) v2 (Roche). The yield of the emPCR was 19.4%, at
the upper end of the 5 to 20% yield range recommended by the Roche procedure.

Approximately 790,000 beads for a 1/4 region were loaded on the GS Titanium PicoTiterPlate
PTP Kit 70x75 and sequenced with the GS FLX Titanium Sequencing Kit XLR70 (Roche). The run
was performed overnight and then analyzed on the cluster through the gsRunBrowser and
Newbler assembler (Roche). A total of 322,962 passed filter wells were obtained and generated
64.2 Mb of sequences with an average length of 310 bp. The passed filter sequences were
assembled using Newbler with 90% identity and 40 bp as overlap. The final assembly identified
60 scaffolds generating a genome size of 4.6 Mb.

Genome annotation 

Open Reading Frames (ORFs) were predicted using Prodigal [30] with default parameters, but
predicted ORFs were excluded if they spanned a sequencing gap region. The predicted bacterial
protein sequences were searched against the GenBank database [31] and the Clusters of
Orthologous Groups (COG) databases using BLASTP. The tRNAScanSE tool [32] was used to find
tRNA genes, whereas ribosomal RNAs were found by using RNAmmer [33] and BLASTn against the
GenBank database. ORFans were identified if their BLASTP E-value was lower than 1e-03 for
alignment lengths greater than 80 amino acids. If alignment lengths were smaller than 80
amino acids, we used an E-value of 1e-05.
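
The length-dependent E-value cutoffs above can be sketched as follows. The text states the two thresholds but not exactly how they are applied, so this sketch makes two assumptions: a hit counts as significant when its E-value is below the cutoff for its alignment length, and a protein is treated as an ORFan when none of its BLASTP hits is significant. Alignments of exactly 80 amino acids, which the text leaves unassigned, are given the stricter cutoff. All function names are hypothetical.

```python
LENGTH_LIMIT = 80          # alignment length in amino acids
CUTOFF_LONG = 1e-03        # E-value cutoff for alignments longer than 80 aa
CUTOFF_SHORT = 1e-05       # E-value cutoff for shorter alignments


def evalue_cutoff(alignment_length: int) -> float:
    """Return the E-value cutoff that applies to an alignment of this length."""
    return CUTOFF_LONG if alignment_length > LENGTH_LIMIT else CUTOFF_SHORT


def is_significant_hit(evalue: float, alignment_length: int) -> bool:
    """A hit is significant when its E-value is below the applicable cutoff."""
    return evalue < evalue_cutoff(alignment_length)


def is_orfan(hits) -> bool:
    """Assumed rule: an ORFan has no significant BLASTP hit.

    `hits` is an iterable of (evalue, alignment_length) pairs.
    """
    return not any(is_significant_hit(e, n) for e, n in hits)


# A weak short hit does not rescue a protein from ORFan status,
# while a strong long hit does.
print(is_orfan([(1e-04, 50)]), is_orfan([(1e-10, 200)]))
```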

To estimate the mean level of nucleotide sequence similarity at the genome level between
B. massiliogorillae sp. nov. strain G2T and three other Bacillus species (Table 6), we
compared genomes pairwise and determined the mean percentage of nucleotide sequence identity
among orthologous ORFs using BLASTn. Orthologous genes were detected using the Proteinortho
software [34].

Figure 5. Gel view comparing Bacillus massiliogorillae G2T spectra with other members of the Bacillus genus
(B. thuringiensis, B. smithii, B. simplex, B. psychrosaccharolyticus, B. nealsonii, B. megaterium, B. lentus,
B. flexus, B. firmus, B. circulans and B. benzoevorans). The Gel View displays the raw spectra of all loaded
spectrum files arranged in a pseudo-gel like look. The x-axis records the m/z value. The left y-axis displays
the running spectrum number originating from subsequent spectra loading. The peak intensity is expressed by
a gray scale scheme code. The color bar and the right y-axis indicate the relation between the color a peak
is displayed with and the peak intensity in arbitrary units.

Spectra shown: Bacillus thuringiensis DSM 2046T, Bacillus smithii DSM 4216T, Bacillus smithii CIP 103790T,
Bacillus simplex DSM 1321T, Bacillus simplex CS 206_1a BRB, Bacillus psychrosaccharolyticus DSM 6T,
Bacillus nealsonii DSM 15077T, Bacillus megaterium DSM 32T, Bacillus massiliogorillae G2T, Bacillus lentus
DSM 9T, Bacillus flexus DSM 1320T, Bacillus flexus 100331_30 USP, Bacillus firmus DSM 12T, Bacillus
circulans DSM 11T, Bacillus circulans CS 220_1a BRB, Bacillus benzoevorans DSM 5391T.



Table 3. Project information

MIGS ID      Property                 Term
MIGS-31      Finishing quality        High-quality draft
MIGS-28      Libraries used           454 paired-end 3-kb libraries
MIGS-29      Sequencing platform      454 GS FLX Titanium
MIGS-31.2    Sequencing coverage      13×
MIGS-30      Assemblers               Newbler version 2.5.3
MIGS-32      Gene calling method      Prodigal
             EMBL Date of Release     April 18, 2013
             EMBL ID                  CAVL000000000
MIGS-13      Project relevance        Study of the gorilla gut microbiome

Genome properties

The genome is 5,431,633 bp long (1 chromosome, but no plasmid) with a 34.95% G+C content
(Figure 6 and Table 4). It is composed of 66 large contigs. Of the 5,276 predicted genes,
5,179 were protein-coding genes and 98 were RNAs (one 16S rRNA gene, one 23S rRNA gene, five
5S rRNA genes and 91 tRNA genes). A total of 3,801 genes (73.39%) were assigned a putative
function (by COGs or by NR BLAST) and 368 genes were identified as ORFans (7.11%). The
remaining genes were annotated as hypothetical proteins (666 genes, 12.86%). The distribution
of genes into COG functional categories is presented in Table 5. The properties and
statistics of the genome are summarized in Tables 4 and 5.
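
The percentages quoted above can be reproduced from the gene counts. The sketch below assumes the denominator is the number of protein-coding genes (5,179), which is the choice that yields the reported figures; the variable and function names are illustrative only.

```python
PROTEIN_CODING_GENES = 5179  # protein-coding gene count reported in the text

# Gene counts reported in the text for each annotation class.
GENE_COUNTS = {
    "putative function": 3801,
    "ORFans": 368,
    "hypothetical proteins": 666,
}


def percent_of_protein_coding(count: int, total: int = PROTEIN_CODING_GENES) -> float:
    """Share of protein-coding genes, rounded to two decimals as in the text."""
    return round(100.0 * count / total, 2)


percentages = {name: percent_of_protein_coding(n) for name, n in GENE_COUNTS.items()}
for name, value in percentages.items():
    print(f"{name}: {value}%")
```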

Figure 6. Graphical circular map of the genome. From outside in: contigs (red / grey), COG category of genes on
the forward strand (three circles), genes on forward strand (blue circle), genes on the reverse strand (red circle),
COG category on the reverse strand (three circles), GC content. The inner-most circle shows GC skew, purple and
olive indicating negative and positive values, respectively.

Table 4. Nucleotide content and gene count levels of the genome

Attribute                           Value        % of total a
Genome size (bp)                    5,431,633    100
Coding region (bp)                  4,561,267    83.98
G+C content (bp)                    1,898,498    34.95
Total genes                         5,276        100
RNA genes                           98           1.84
Protein-coding genes                5,179        98.16
Genes with function prediction      3,801        73.39
Genes assigned to COGs              3,910        75.49
Genes with peptide signals          610          11.78
Genes with transmembrane helices    1,347        26.01

a The total is based on either the size of the genome in base pairs or the total number
of protein coding genes in the annotated genome



Table 5. Number of genes associated with the 25 general COG functional categories

Code   Value    % age a   Description
J      180      3.48      Translation, ribosomal structure and biogenesis
A      0        0         RNA processing and modification
K      438      8.46      Transcription
L      191      3.69      Replication, recombination and repair
B      2        0.04      Chromatin structure and dynamics
D      42       0.81      Cell cycle control, mitosis and meiosis
Y      0        0         Nuclear structure
V      110      2.12      Defense mechanisms
T      275      5.31      Signal transduction mechanisms
M      182      3.51      Cell wall/membrane biogenesis
N      88       1.7       Cell motility
Z      0        0         Cytoskeleton
W      0        0         Extracellular structures
U      63       1.22      Intracellular trafficking and secretion
O      130      2.51      Posttranslational modification, protein turnover, chaperones
C      293      5.66      Energy production and conversion
G      247      4.77      Carbohydrate transport and metabolism
E      474      9.15      Amino acid transport and metabolism
F      110      2.12      Nucleotide transport and metabolism
H      177      3.42      Coenzyme transport and metabolism
I      188      3.63      Lipid transport and metabolism
P      300      5.79      Inorganic ion transport and metabolism
Q      133      2.57      Secondary metabolites biosynthesis, transport and catabolism
R      664      12.82     General function prediction only
S      344      6.64      Function unknown
-      1,269    24.50     Not in COGs

a The total is based on the total number of protein coding genes in the annotated genome

Comparison with other Bacillus species genomes

Here, we compared the genome of B. massiliogorillae strain G2T with those of
B. psychrosaccharolyticus strain ATCC 23296, B. megaterium strain DSM 319 and
B. thuringiensis strain ATCC 10792 (Table 6). The draft genome of B. massiliogorillae is
larger than those of B. psychrosaccharolyticus and B. megaterium (5.43 vs 4.59 and 5.1 Mb,
respectively) and smaller than that of B. thuringiensis (5.43 vs 6.26 Mb).
B. massiliogorillae has a lower G+C content than B. psychrosaccharolyticus (34.95% vs 38.8%)
and B. megaterium (34.95% vs 38.1%) but a slightly higher one than B. thuringiensis (34.95%
vs 34.8%). The protein content of B. massiliogorillae is higher than those of
B. psychrosaccharolyticus and B. megaterium (5,179 vs 4,832 and 5,100, respectively) and
lower than that of B. thuringiensis (5,179 vs 6,243) (Table 6). In addition,
B. massiliogorillae shares 1,936, 1,966 and 1,877 orthologous genes with
B. psychrosaccharolyticus, B. megaterium and B. thuringiensis, respectively (Table 6). The
nucleotide sequence identity of orthologous genes ranges from 68.46 to 70.15% among Bacillus
species, and from 69.28 to 70.15% between B. massiliogorillae and other Bacillus species
(Table 6), thus confirming its new species status. Table 6 summarizes the number of
orthologous genes and the average percentage of nucleotide sequence identity between the
different genomes studied.
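
The identity ranges quoted above follow directly from the pairwise values in the upper right triangle of Table 6. The sketch below stores those six values and recomputes both ranges; the data structure and function name are illustrative choices, not part of the original analysis pipeline.

```python
# Mean nucleotide identity (%) among orthologous genes, from Table 6.
PAIRWISE_IDENTITY = {
    ("B. massiliogorillae", "B. psychrosaccharolyticus"): 70.15,
    ("B. massiliogorillae", "B. megaterium"): 69.28,
    ("B. massiliogorillae", "B. thuringiensis"): 69.66,
    ("B. psychrosaccharolyticus", "B. megaterium"): 68.74,
    ("B. psychrosaccharolyticus", "B. thuringiensis"): 68.46,
    ("B. megaterium", "B. thuringiensis"): 69.86,
}


def identity_range(species=None):
    """Return (min, max) identity over all pairs, or over pairs involving `species`."""
    values = [
        identity
        for pair, identity in PAIRWISE_IDENTITY.items()
        if species is None or species in pair
    ]
    if not values:
        raise ValueError(f"no comparisons found for {species!r}")
    return min(values), max(values)


print(identity_range())                       # all four genomes
print(identity_range("B. massiliogorillae"))  # comparisons involving strain G2T
```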



Table 6. The number of orthologous proteins shared between genomes †

                             B. massiliogorillae   B. psychrosaccharolyticus   B. megaterium   B. thuringiensis
B. massiliogorillae          5,179                 70.15                       69.28           69.66
B. psychrosaccharolyticus    1,936                 4,832                       68.74           68.46
B. megaterium                1,966                 1,962                       5,100           69.86
B. thuringiensis             1,877                 1,873                       1,903           6,243

† Lower left triangle: shared orthologous proteins; upper right triangle: average percentage similarity of
nucleotides corresponding to orthologous proteins shared between genomes; bold (diagonal): number of
proteins per genome



Conclusion 

On the basis of phenotypic (Table 2), phylogenetic and genomic analyses (taxonogenomics)
(Table 6), we formally propose the creation of Bacillus massiliogorillae sp. nov., which
contains strain G2T. This strain was found in a stool sample collected from a gorilla in
Cameroon.

Description of Bacillus massiliogorillae sp. nov. 

Bacillus massiliogorillae (ma.sil.io.go.ril'ae. L. gen. masc. n. massiliogorillae, combination
of Massilia, the Latin name of Marseille, where strain G2T was isolated, and of Gorilla, the
Latin name of the gorilla, from which the stool sample was obtained).

B. massiliogorillae is an aerobic Gram-variable bacterium. Optimal growth is achieved
aerobically. No growth is observed in microaerophilic or anaerobic conditions. Growth occurs
on axenic media between 25 and 45°C, with optimal growth observed at 37°C. Cells stain
Gram-positive or negative, are rod-shaped, endospore-forming, motile and have a mean diameter
of 1 µm (range 0.8 to 1.2 µm) and a mean length of 5.4 µm (range 3.2 to 7.5 µm). Colonies are
grey opaque and 2-5 mm in diameter on blood-enriched BHI agar.



Catalase positive but oxidase negative. Using the API 50CH system (BioMerieux), a positive
reaction is obtained for D-glucose, D-fructose, D-ribose, N-acetylglucosamine, amygdalin,
arbutin, aesculin, salicin, cellobiose, maltose, D-lactose, D-trehalose, D-saccharose, and
hydrolysis of starch. Using the API ZYM system, positive reactions are obtained for esterase
(C4), esterase lipase (C8), acid phosphatase, α-glucosidase and N-acetyl-β-glucosaminidase.
Using API 20NE, neither nitrate reduction nor indole production is observed, but the urease
reaction is positive. Susceptible to amoxicillin, nitrofurantoin, erythromycin, doxycycline,
rifampin, vancomycin, gentamicin and imipenem but resistant to trimethoprim-sulfamethoxazole,
ciprofloxacin, ceftriaxone and amoxicillin-clavulanic acid.

The G+C content of the genome is 34.95%. The 16S rRNA and genome sequences are deposited in
GenBank under accession numbers JX650055 and CAVL00000000, respectively. The type strain G2T
(= CSUR P206 = DSM 26159) was isolated from the fecal flora of a Gorilla gorilla gorilla from
Cameroon.